Abstract - In this work we investigate the information loss in 
(nonlinear) dynamical input-output systems and provide some 
general results. In particular, we present an upper bound on the 
information loss rate, defined as the (non-negative) difference 
between the entropy rates of the jointly stationary stochastic 
processes at the input and output of the system. 

We further introduce a family of systems with vanishing 
information loss rate. It is shown that not only linear filters belong 
to that family, but - under certain circumstances - also finite- 
precision implementations of the latter, which typically consist of 
nonlinear elements. 

I. Introduction 

Transmission and processing of information is the primary 
concern in many fields of communications, signal processing, 
and machine learning. The typical impairments considered in 
these contexts are noise and interference, incomplete data sets, 
and coarse observations, eliciting both information-theoretic 
and energy-centered analyses. In contrast, the effect of
deterministic input-output systems on the information content,
i.e., the entropy rate, of a signal has not yet been thoroughly 
analyzed. Still, nonlinear dynamical systems - capable of 
changing information content - are omnipresent in commu- 
nication systems in the roles of high-power amplifiers or fre- 
quency mixers. Another example is the energy detector, a low- 
complexity receiver architecture for wireless communications. 
To obtain a better understanding of the effects of these system 
components, an information-theoretic treatment is essential. 

In this paper, we establish a framework for analyzing 
the effects of discrete-time dynamical systems with a finite- 
dimensional state vector on the entropy rate of a signal. While 
the analysis of continuous-valued stochastic processes will be 
left for future work, here we focus on (jointly) stationary input 
and output processes taking values from countable alphabets. 

The data processing inequality (DPI, [1, pp. 35]) states
that the entropy of a discrete random variable (RV) cannot
increase by passing the RV through a static nonlinearity. It was
shown that the same result holds for entropy rates of jointly
stationary stochastic processes on finite alphabets, both for
static nonlinearities [2] and general dynamical systems [3].
Continuous-valued processes passing through linear filters
were already analyzed by Shannon in terms of differential
entropy rates [4], [5], which in our opinion are not adequate
measures of information loss, cf. Section V. The conditional
entropy, used to characterize the information lost by passing
a continuous RV through a static nonlinearity [6] or by
multiplying two integers [7], appears to be more appropriate.



We start by defining the information loss rate in Section II
and show that this quantity is equal to the difference between
the entropy rates of the input and output processes. This choice
establishes the DPI for dynamical systems in Section III,
stating that the information loss rate is non-negative. This
result is then complemented by an upper bound that can
be evaluated easily. In Section IV we introduce a family of
dynamical systems for which we show that the information
loss rate vanishes. This family not only comprises a large
class of stable linear filters (see Section V) but also their
finite-precision counterparts, commonly used in digital signal
processing. Aside from the latter, Section VI discusses some
other examples illustrating our theoretical results. 

This document is an extended version of a paper submitted 
to an IEEE conference. 

II. Problem Statement & Preliminaries 

We consider a discrete-time regular two-sided stationary
stochastic process $\mathbf{X}$ taking values from a countable set
$\mathcal{X}$. Let $X_n$ denote the RV of the $n$-th sample and let
$X_k^n = \{X_k, X_{k+1}, \dots, X_n\}$, thus $\mathbf{X} = X_{-\infty}^{\infty}$. For the
actual value of $X_n$ we write $x_n$. We further consider another
countable set $\mathcal{Y}$ which need not be identical to $\mathcal{X}$.
Let $H(X_n)$ denote the zeroth-order entropy of $X_n$ and let
$\bar{H}(\mathbf{X}) = \lim_{n\to\infty} \frac{1}{n} H(X_1^n)$ denote the entropy rate of $\mathbf{X}$.
The restriction to countable sets ensures that entropies and 
entropy rates are well-defined. 

The following class of dynamical systems is treated in this 
work: 

Definition 1 (Finite-Dimensional Dynamical System). Let
$Y_n = f(X_{n-N}^{n}, Y_{n-M}^{n-1})$, $0 \le M, N < \infty$, be the RV of
the $n$-th output sample of a dynamical system with a finite-
dimensional state vector subject to the input process $\mathbf{X}$. Here,
$f: \mathcal{X}^{N+1} \times \mathcal{Y}^{M} \to \mathcal{Y}$ is a function such that the sequence of
output samples, $Y_n$, constitutes a two-sided stochastic process
$\mathbf{Y}$ jointly stationary with $\mathbf{X}$.

Definition 2 (Information Loss Rate). Let $\mathbf{X}$ and $\mathbf{Y}$ be
jointly stationary processes on countable sets related as in
Definition 1. The average information lost per sample is given
by the conditional entropy rate

$$\bar{H}(\mathbf{X}|\mathbf{Y}) = \lim_{n\to\infty} \frac{1}{n} H(X_1^n | Y_1^n). \tag{1}$$

Characterizing the information loss as a conditional entropy 
rate is quite intuitive: The conditional entropy rate denotes the 



average number of bits per sample unknown about the input 
sequence after observing the output sequence; i.e., the average 
information lost per sample by passing the sequence through 
the system in question. 
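As a minimal sketch of Definition 2, the following brute-force computation evaluates $H(X_1^n|Y_1^n)$ for a hypothetical toy system that is not taken from the paper: $Y_k = X_k \oplus X_{k-1}$ (XOR) driven by iid uniform binary inputs. The function name and the choice of system are assumptions made purely for illustration.

```python
from collections import defaultdict
from itertools import product
from math import log2


def block_conditional_entropy(n):
    """H(X_1^n | Y_1^n) in bits for Y_k = X_k XOR X_{k-1}, iid uniform binary X_0..X_n."""
    joint = defaultdict(float)
    p = 0.5 ** (n + 1)  # probability of each sequence (x_0, ..., x_n)
    for xs in product((0, 1), repeat=n + 1):  # xs[0] plays the role of X_0
        ys = tuple(xs[k] ^ xs[k - 1] for k in range(1, n + 1))
        joint[(xs[1:], ys)] += p
    marginal_y = defaultdict(float)
    for (_, ys), q in joint.items():
        marginal_y[ys] += q
    # H(X|Y) = -sum p(x, y) log2 p(x | y)
    return -sum(q * log2(q / marginal_y[ys]) for (_, ys), q in joint.items())


block_losses = {n: block_conditional_entropy(n) for n in range(1, 7)}
per_sample_losses = {n: h / n for n, h in block_losses.items()}
```

For this toy system the block loss does not grow with $n$, so the per-sample loss in `per_sample_losses` shrinks as $1/n$; this is the normalization that the limit in (1) captures.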

Before proceeding with the analysis, we will introduce two 
Lemmas: 

Lemma 1. For any set of discrete RVs $Z_1^n$ and any function
$f(Z_k, Z_l, \dots)$, $1 \le k, l, \dots \le n$, the following holds:

$$H(Z_1^n, f(Z_k, Z_l, \dots)) = H(Z_1^n). \tag{2}$$

Proof: See [1, Prob. 2.4]. ■
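Lemma 1 can be checked numerically on a small case. The sketch below uses a hypothetical joint distribution of two binary RVs and a non-injective function (XOR); both are illustrative assumptions and do not appear in the paper.

```python
from collections import Counter
from math import log2


def entropy(pmf):
    """Shannon entropy in bits of a pmf given as {outcome: probability}."""
    return -sum(p * log2(p) for p in pmf.values() if p > 0)


# Hypothetical joint pmf of (Z_1, Z_2).
joint = {(0, 0): 0.5, (0, 1): 0.25, (1, 0): 0.125, (1, 1): 0.125}


def f(z1, z2):
    return z1 ^ z2  # non-injective function of the RVs


augmented = Counter()   # pmf of (Z_1, Z_2, f(Z_1, Z_2))
function_only = Counter()  # pmf of f(Z_1, Z_2) alone
for (z1, z2), p in joint.items():
    augmented[(z1, z2, f(z1, z2))] += p
    function_only[f(z1, z2)] += p

h_z = entropy(joint)
h_z_and_f = entropy(augmented)
h_f = entropy(function_only)
```

Appending the function value leaves the joint entropy unchanged (`h_z_and_f` equals `h_z`), while the function value on its own carries less entropy (`h_f` is smaller than `h_z`).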



Lemma 2. Let $\mathbf{X}$ and $\mathbf{Y}$ be jointly stationary stochastic
processes on countable sets. Then, for $M < \infty$,

$$\bar{H}(\mathbf{X}) = \lim_{n\to\infty} \frac{1}{n} H(X_1^n | Y_1^M) = \lim_{n\to\infty} \frac{1}{n} H(X_1^n, Y_1^M).$$

Proof: Clearly,

$$H(X_1^n | Y_1^M) \le H(X_1^n) \le H(X_1^n, Y_1^M) \tag{3}$$

for all $n$, thus also in the limit. Now, since $H(X_1^n, Y_1^M) =
H(X_1^n | Y_1^M) + H(Y_1^M)$ and since all involved quantities are non-
negative,

$$\bar{H}(\mathbf{X}) \le \lim_{n\to\infty} \frac{1}{n} H(X_1^n | Y_1^M) + \lim_{n\to\infty} \frac{1}{n} H(Y_1^M). \tag{4}$$

Thus in the limit the upper and lower bound are equal and the
proof is completed. ■
Since the input and output alphabets of the dynamical
systems can be countable, it may occur that the entropy of a
single sample becomes infinite. Yet, by the maximum entropy
property of the uniform distribution,

$$H(Y_1^M) \le M H(Y) \le \lim_{|\mathcal{Y}|\to\infty} M \log|\mathcal{Y}| \tag{5}$$

which approaches infinity at a slower rate than $\lim_{n\to\infty} n$.
Thus the term on the right in (4) approaches zero even for
processes $\mathbf{Y}$ with infinite zeroth-order entropy or infinite
entropy rate.

III. Information Loss Rate in Dynamical Systems 

In this Section, which comprises the main contribution of 
this work, we present some general results on the information 
loss rate induced by a system satisfying Definition 1. We will
start by proving a theorem which essentially states that the
information loss rate is identical to the difference of entropy 
rates: 

Theorem 1. Let $\mathbf{X}$ and $\mathbf{Y}$ be jointly stationary processes on
countable sets related as in Definition 1. Then, the information
loss rate is given by the difference of entropy rates:

$$\bar{H}(\mathbf{X}|\mathbf{Y}) = \bar{H}(\mathbf{X}) - \bar{H}(\mathbf{Y}). \tag{6}$$

Proof: While the proof for static functions (i.e., $M =
N = 0$) is relatively simple [2], for dynamical systems we
have to show that

$$\bar{H}(\mathbf{X}|\mathbf{Y}) = \lim_{n\to\infty} \frac{1}{n} \left( H(X_1^n, Y_1^n) - H(Y_1^n) \right) \tag{7}$$

$$= \lim_{n\to\infty} \frac{1}{n} H(X_1^n) - \lim_{n\to\infty} \frac{1}{n} H(Y_1^n) \tag{8}$$

i.e., that

$$\lim_{n\to\infty} \frac{1}{n} H(X_1^n, Y_1^n) = \lim_{n\to\infty} \frac{1}{n} H(X_1^n). \tag{9}$$

Consider that, for $n > \max\{M, N\}$,

$$H(X_1^n, Y_1^n) = H(Y_n, X_1^n, Y_1^{n-1}) \tag{10}$$

$$= H(f(X_{n-N}^{n}, Y_{n-M}^{n-1}), X_1^n, Y_1^{n-1}) \tag{11}$$

$$\overset{(a)}{=} H(X_1^n, Y_1^{n-1}) \tag{12}$$

where (a) is due to Lemma 1. By repeated application,

$$H(X_1^n, Y_1^n) = H(X_1^n, Y_1^{\max\{M,N\}}). \tag{13}$$

Since this holds for all $n > \max\{M, N\}$, it also holds in the
limit and with Lemma 2 we obtain

$$\lim_{n\to\infty} \frac{1}{n} H(X_1^n, Y_1^{\max\{M,N\}}) = \lim_{n\to\infty} \frac{1}{n} H(X_1^n) \tag{14}$$

and thus

$$\bar{H}(\mathbf{X}|\mathbf{Y}) = \bar{H}(\mathbf{X}) - \bar{H}(\mathbf{Y}). \tag{15}$$

This completes the proof. ■

The significance of this Theorem lies in the fact that the
information loss can be inferred by comparing the entropy
rates of the input and output processes. Note that the same
does not hold for differential entropy rates, as we will argue
in Section V.

By the non-negativity of the conditional entropy rate the
following Corollary to Theorem 1 shows that the entropy
rate of the system output cannot be larger than the entropy
rate of the system input. This result, originally stated in [3]
for finite alphabets, further justifies our intuitive definition of
information loss:

Corollary 1 (DPI for Dynamical Systems). Let $\mathbf{X}$ and $\mathbf{Y}$ be
jointly stationary processes on countable sets related as in
Definition 1. Then, the entropy rate of the output process $\mathbf{Y}$
cannot be larger than the entropy rate of the input process $\mathbf{X}$,
i.e.,

$$\bar{H}(\mathbf{Y}) \le \bar{H}(\mathbf{X}). \tag{16}$$



Generally, the computation of entropy rates is a non-trivial 
problem, where closed-form solutions exist only for simple 
processes (e.g., Markov chains). Since functions of stochastic 
processes rarely allow such a simplified treatment, the avail- 
ability of bounds is of vital importance. We will thus present 
an upper bound on the information loss rate, which is simple 
to evaluate: 



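Theorem 2 (Upper Bound). Let $\mathbf{X}$ and $\mathbf{Y}$ be jointly stationary
processes on countable sets related as in Definition 1. Then,
the information loss rate is bounded by

$$\bar{H}(\mathbf{X}|\mathbf{Y}) \le \max_{(x,\theta) \in \mathcal{X} \times \mathcal{T}} \log \left| f_\theta^{-1}[f_\theta(x)] \right| \tag{17}$$

where $\mathcal{T} = \mathcal{X}^N \times \mathcal{Y}^M$, $\theta \in \mathcal{T}$ are the possible values of the
RV $\Theta_n = \{X_{n-N}^{n-1}, Y_{n-M}^{n-1}\}$, and $f_\theta^{-1}[\cdot]$ denotes the preimage
under $f_\theta$, an instantiation of the function $f_{\Theta_n}(\cdot) = f(\cdot, \Theta_n)$.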

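Proof:

$$\bar{H}(\mathbf{X}|\mathbf{Y}) = \lim_{n\to\infty} \frac{1}{n} \left( H(X_1^n, Y_1^n) - H(Y_1^n) \right) \tag{18}$$

$$\overset{(a)}{=} \lim_{n\to\infty} \frac{1}{n} \left( \sum_{i=1}^{n} H(X_i, Y_i | X_1^{i-1}, Y_1^{i-1}) - \sum_{i=1}^{n} H(Y_i | Y_1^{i-1}) \right) \tag{19}$$

$$\overset{(b)}{\le} \lim_{n\to\infty} \frac{1}{n} \left( \sum_{i=1}^{n} H(X_i, Y_i | X_1^{i-1}, Y_1^{i-1}) - \sum_{i=1}^{n} H(Y_i | X_1^{i-1}, Y_1^{i-1}) \right) \tag{20}$$

$$= \lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} H(X_i | X_1^{i-1}, Y_1^{i}) \tag{21}$$

where (a) is due to the chain rule of entropy and (b) is due
to the fact that conditioning reduces entropy. The expression
under the sum in (21) is a non-negative decreasing sequence in
$i$ and thus has a limit. We use the Cesàro mean [1, Thm. 4.2.3]
and obtain

$$\bar{H}(\mathbf{X}|\mathbf{Y}) \le \lim_{n\to\infty} H(X_n | X_1^{n-1}, Y_1^{n}) \tag{22}$$

$$\le H(X_n | X_{n-N}^{n-1}, Y_{n-M}^{n}) \tag{23}$$

$$= H(X_n | Y_n, \Theta_n). \tag{24}$$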



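We now replace $Y_n = f(X_n, \Theta_n) = f_{\Theta_n}(X_n)$, where we
treat the collection of all previous RVs influencing $Y_n$ as a
(random) parameter $\Theta_n$ of the function. This approach lets
us interpret the dynamical system as a parameterized static
system $f_{\Theta_n}: \mathcal{X} \to \mathcal{Y}$, where we let $\Theta_n$ take values $\theta$ from
$\mathcal{T} = \mathcal{X}^N \times \mathcal{Y}^M$. We thus continue

$$\bar{H}(\mathbf{X}|\mathbf{Y}) \le H(X_n | f_{\Theta_n}(X_n), \Theta_n) \tag{25}$$

$$= \sum_{(x,\theta) \in \mathcal{X} \times \mathcal{T}} H(X_n | f_\theta(x), \theta) \Pr(X_n = x, \Theta_n = \theta)$$

$$\overset{(c)}{\le} \sum_{(x,\theta) \in \mathcal{X} \times \mathcal{T}} \log \left| f_\theta^{-1}[f_\theta(x)] \right| \Pr(X_n = x, \Theta_n = \theta)$$

$$\le \max_{(x,\theta) \in \mathcal{X} \times \mathcal{T}} \log \left| f_\theta^{-1}[f_\theta(x)] \right| \tag{26}$$

where (c) is due to conditioning and the maximum entropy
property of the uniform distribution over an alphabet size equal
to the cardinality of the preimage under $f_\theta$. Maximizing over
all possible $x$ and parameter values $\theta$ completes the proof. ■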
Fig. 1. Cascade of systems.





This result can be interpreted as relating the information 
loss rate of a dynamical system to the information loss rate 
induced by a static function. In particular, we let the static 
function be parameterized by previous input and output values 
taking effect on $Y_n$ and upper bound the information loss
rate by the logarithm of the maximum cardinality of the preimage under $f_\theta$.
While this upper bound may be rather conservative, it is
particularly simple to evaluate if the system function from
Definition 1 is available. We will illustrate the use of this
result in Section VI-C.
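For finite alphabets the bound of Theorem 2 can be evaluated by exhaustive search. The following is a minimal sketch under that assumption; the function `loss_rate_upper_bound`, its calling convention `f(x, past_x, past_y)`, and the two toy systems are hypothetical choices made for illustration. The search runs over all of $\mathcal{X}^N \times \mathcal{Y}^M$, as in the definition of $\mathcal{T}$, and reports the bound in bits.

```python
from itertools import product
from math import log2


def loss_rate_upper_bound(f, alphabet_x, alphabet_y, n_taps, m_taps):
    """Return max over (x, theta) of log2 |f_theta^{-1}[f_theta(x)]| for finite alphabets."""
    worst = 1
    for past_x in product(alphabet_x, repeat=n_taps):
        for past_y in product(alphabet_y, repeat=m_taps):
            # Count how many x map to each output value for this fixed theta.
            preimage_sizes = {}
            for x in alphabet_x:
                y = f(x, past_x, past_y)
                preimage_sizes[y] = preimage_sizes.get(y, 0) + 1
            worst = max(worst, max(preimage_sizes.values()))
    return log2(worst)


ALPHABET = (-2, -1, 1, 2)  # hypothetical input alphabet without the element 0


def product_system(x, past_x, past_y):
    return x * past_x[0]  # Y_n = X_n * X_{n-1}


def squared_system(x, past_x, past_y):
    return x * x + past_x[0]  # Y_n = X_n^2 + X_{n-1}


bound_product = loss_rate_upper_bound(product_system, ALPHABET, (), 1, 0)
bound_squared = loss_rate_upper_bound(squared_system, ALPHABET, (), 1, 0)
```

The first system is injective in $X_n$ for every parameter value, so the bound is zero; squaring merges $x$ and $-x$, so at most one bit per sample can be lost.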

Finally, we present a result about the cascade of systems
(see Fig. 1):

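Theorem 3 (Cascading Systems). Let $\mathbf{X}$, $\mathbf{Y}$, and $\mathbf{Z}$ be jointly
stationary stochastic processes on countable sets $\mathcal{X}$, $\mathcal{Y}$, and
$\mathcal{Z}$, respectively, where $\mathbf{Y}$ is generated by passing $\mathbf{X}$ through a
system satisfying Definition 1 and $\mathbf{Z}$ is generated by passing
$\mathbf{Y}$ through another such system. Then, the information loss
rate induced by the cascade is

$$\bar{H}(\mathbf{X}|\mathbf{Z}) = \bar{H}(\mathbf{X}|\mathbf{Y}) + \bar{H}(\mathbf{Y}|\mathbf{Z}). \tag{27}$$

Proof: By using Theorem 1, $\bar{H}(\mathbf{X}|\mathbf{Z})$ can be written as

$$\bar{H}(\mathbf{X}|\mathbf{Z}) = \bar{H}(\mathbf{X}) - \bar{H}(\mathbf{Z}) \tag{28}$$

$$= \bar{H}(\mathbf{X}) - \bar{H}(\mathbf{Y}) + \bar{H}(\mathbf{Y}) - \bar{H}(\mathbf{Z}) \tag{29}$$

$$= \bar{H}(\mathbf{X}|\mathbf{Y}) + \bar{H}(\mathbf{Y}|\mathbf{Z}). \tag{30}$$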



IV. Partially Invertible Systems 

We now impose an additional restriction on the system
function in Definition 1. This additional restriction defines a
family of systems for which the information loss rate can be
shown to vanish.

Definition 3 (Partially invertible system). A system satisfying
Definition 1 is partially invertible if there exists a function
$f_{\mathrm{inv}}: \mathcal{X}^N \times \mathcal{Y}^{M+1} \to \mathcal{X}$ such that

$$X_n = f_{\mathrm{inv}}(X_{n-N}^{n-1}, Y_{n-M}^{n}) = f_{\mathrm{inv}}(Y_n, \Theta_n) = f_{\Theta_n}^{-1}(Y_n). \tag{31}$$

In other words, a system is partially invertible if its parameter-
ized static function $f_{\Theta_n}$ is invertible for all possible parameter
values $\theta \in \mathcal{T}$.

We will now argue that for this class of systems the 
information loss rate vanishes. We start by showing that the 
total information loss for a finite-length input sequence $X_1^K$
after observing an output sequence $Y_1^K$ of the same length
remains bounded independently of the sequence length: 

Theorem 4. Let $X_1^K$ and $Y_1^K$, $K > \max\{M, N\}$, be two
finite-length sequences of jointly stationary processes $\mathbf{X}$ and
$\mathbf{Y}$ on countable sets $\mathcal{X}$ and $\mathcal{Y}$, respectively, where $\mathbf{Y}$ is
generated by passing $\mathbf{X}$ through a partially invertible system.
Then, the information loss becomes

$$H(X_1^K | Y_1^K) = H(X_1^{\max\{M,N\}} | Y_1^K). \tag{32}$$

Proof: We start by noticing that $H(X_1^K | Y_1^K) =
H(X_1^K, Y_1^K) - H(Y_1^K)$ and

$$H(X_1^K, Y_1^K) = H(X_K, X_1^{K-1}, Y_1^K) \tag{33}$$

$$= H(f_{\mathrm{inv}}(X_{K-N}^{K-1}, Y_{K-M}^{K}), X_1^{K-1}, Y_1^K)$$

$$\overset{(a)}{=} H(X_1^{K-1}, Y_1^K) \tag{34}$$

where (a) is due to Lemma 1. Repeating this step a number
of times yields

$$H(X_1^K, Y_1^K) = H(X_1^{\max\{M,N\}}, Y_1^K). \tag{35}$$

Subtracting $H(Y_1^K)$ completes the proof. ■
Note that even though $f_{\Theta_n}$ is invertible for all parameter
values $\theta$, this only means that $H(X_n | f_{\Theta_n}(X_n), \Theta_n) = 0$,
while $H(X_n | f_{\Theta_n}(X_n)) \ge 0$. This corresponds to the state-
ment of Theorem 4, where for $n \le \max\{M, N\}$ $\Theta_n$ has
to be considered unknown. It is also important to note that
$H(X_1^K | Y_1^K) \ne H(X_1^K) - H(Y_1^K)$. While the information loss
rate is equal to the difference of entropy rates (cf. Theorem 1),
it does not hold generally that the difference of joint entropies
is equal to the joint conditional entropy.

We will now make use of this result in proving that partially 
invertible systems have a vanishing information loss rate: 

Corollary 2. Let $\mathbf{X}$ and $\mathbf{Y}$ be jointly stationary processes on
countable sets related as in Theorem 4. Then, the information
loss rate induced by passing the process $\mathbf{X}$ through the system
vanishes, i.e.,

$$\bar{H}(\mathbf{X}|\mathbf{Y}) = 0. \tag{36}$$

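Proof: We provide two proofs for this Corollary. For the
first, note that irrespective of $\theta$ the inverse function $f_\theta^{-1}$ always
exists by Definition 3, so every preimage $f_\theta^{-1}[f_\theta(x)]$ contains
exactly one element and the bound in Theorem 2 evaluates to
$\log 1 = 0$. Together with the non-negativity of the information
loss rate this immediately leads to $\bar{H}(\mathbf{X}|\mathbf{Y}) = 0$.

For the second proof we note that Theorem 4 holds for all
$K$, thus also in the limit. With Definition 2 we can therefore
write the information loss rate as:

$$\bar{H}(\mathbf{X}|\mathbf{Y}) = \lim_{n\to\infty} \frac{1}{n} H(X_1^n | Y_1^n) \tag{37}$$

$$= \lim_{n\to\infty} \frac{1}{n} H(X_1^{\max\{M,N\}} | Y_1^n) \tag{38}$$

$$\le \lim_{n\to\infty} \frac{1}{n} H(X_1^{\max\{M,N\}}) = 0 \tag{39}$$

by similar arguments as in the proof of Lemma 2. ■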
An immediate consequence of this important Corollary is
that, except for the initial samples $X_1^{\max\{M,N\}}$ after starting
the observation of $\mathbf{Y}$ (cf. Theorem 4), the remaining
information of the input process can be recovered by observing
the output process. Note that this does not necessarily mean
that the input process can be reconstructed perfectly, even
if reconstruction errors are allowed in the first $\max\{M, N\}$
samples. An illustrative example for this fact will be given in
Section VI-B.

V. The Case of Linear Filters 

It is interesting to note that an important subclass of 
discrete-time stable causal linear filters falls in the category 
of partially invertible systems, as long as the input and output 
alphabets are countable. An example where the latter condition 
is satisfied is given if the input process and the coefficients 
take values from the field of rational numbers. This subclass, 
powerful enough to cover most applications 0, comprises 
filters with a finite-dimensional state vector described by 
constant-coefficient difference equations: 

$$Y_n = \sum_{k=0}^{N} b_k X_{n-k} + \sum_{l=1}^{M} a_l Y_{n-l}. \tag{40}$$

As noted in [5], stability of the filter guarantees that for a
stationary input process the output process is stationary and
that Definition 1 applies. By rearranging the terms in (40) it
can be verified that this subclass of linear systems satisfies 
the definition of partially invertible systems and, thus, has a 
vanishing information loss rate. 
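The rearrangement can be written out explicitly. The following sketch assumes $b_0 \neq 0$, a condition the text does not state but which is needed to solve the difference equation for the current input sample:

```latex
% Solving the difference equation for X_n, assuming b_0 \neq 0:
X_n = \frac{1}{b_0} \left( Y_n - \sum_{k=1}^{N} b_k X_{n-k} - \sum_{l=1}^{M} a_l Y_{n-l} \right)
    = f_{\mathrm{inv}}\left( X_{n-N}^{n-1}, Y_{n-M}^{n} \right)
```

The right-hand side depends only on the $N$ previous inputs and on the current and $M$ previous outputs, which matches the argument list of $f_{\mathrm{inv}}$ in Definition 3.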

It is noteworthy that this property is independent of the 
minimum-phase property (cf. [10, pp. 280]) of linear filters,
which ensures that the filter has a stable and causal inverse.
Indeed, for filters which are not minimum-phase, the partial
inverse function $f_{\mathrm{inv}}$ used in Definition 3 describes a causal,
but unstable linear filter. As a consequence, to an arbitrary
stationary stochastic input process, the inverse filter described
by $f_{\mathrm{inv}}$ may respond with a non-stationary output process;
however, the response to Y will be X. 

A signal space model may effectively illustrate these
considerations: Let $\mathcal{X}^\infty$ and $\mathcal{Y}^\infty$ be the spaces of stationary input
and output processes $\mathbf{X}$ and $\mathbf{Y}$, respectively, and let $F\{\cdot\}$ be
the (linear) operator mapping each element of $\mathcal{X}^\infty$ to $\mathcal{Y}^\infty$.
By restricting our attention to regular stochastic processes,
i.e., processes which cannot have periodic components, the
operator $F\{\cdot\}$ is injective. As a consequence, for each
element of $\mathcal{Y}^\infty$ there exists at most one element in $\mathcal{X}^\infty$ such
that $\mathbf{Y} = F\{\mathbf{X}\}$. Note, however, that there are stationary
stochastic processes in $\mathcal{Y}^\infty$ which are not images of elements
in $\mathcal{X}^\infty$. Only if $F\{\cdot\}$ is such that it describes a stable, causal
minimum-phase system, i.e., has a stable and causal inverse,
does $\mathcal{Y}^\infty$ contain only images of elements from $\mathcal{X}^\infty$.

This complements a result already introduced by Shannon [4],
which states that the change in differential entropy rate
caused by stable, causal linear filtering of continuous-valued
stationary processes is independent of the process statistics. In
particular, for a linear filter with frequency response $G(e^{j\theta})$ the
differential entropy rate of the output is given by [5, pp. 663]

$$\bar{h}(\mathbf{Y}) = \bar{h}(\mathbf{X}) + \frac{1}{2\pi} \int_{-\pi}^{\pi} \ln \left| G(e^{j\theta}) \right| d\theta. \tag{41}$$

It can be shown (see, e.g., [11]) that the normalized integral above
evaluates to $\ln|b_0| + \sum_{i: |z_i| > 1} \ln|z_i|$, where $z_i$ are the zeros of
the transfer function $G(z)$. For causal minimum-phase systems
($|z_i| < 1 \ \forall i$) with $b_0 = 1$, the differential entropy rates for the
input and output process are equal. This result was recently
verified by [12], which analyzed the invariance of entropy
rates for all-pole filters. Scaling the transfer function of such
a filter such that $b_0 \ne 1$ leads to $\bar{h}(\mathbf{X}) \ne \bar{h}(\mathbf{Y})$, despite the
fact that by scaling no information is lost. Conversely, it is
easily possible that $\bar{h}(\mathbf{X}) = \bar{h}(\mathbf{Y})$ for systems which destroy
information. Therefore, we believe that differential entropies 
and differential entropy rates are not adequate measures for 
information loss. Future investigations will show if alternative 
descriptions for continuous-valued processes will yield more 
appropriate characterizations. 
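The closed-form value of the gain term can be checked numerically for simple filters. The sketch below is an illustration under stated assumptions: it considers only FIR filters $G(z) = \sum_k b_k z^{-k}$, approximates the integral with a midpoint rule, and the helper name `mean_log_gain` and the three coefficient sets are hypothetical.

```python
import cmath
from math import log, pi


def mean_log_gain(b, num_points=4096):
    """Midpoint-rule estimate of (1/2pi) * integral of ln|G(e^{j theta})| over [-pi, pi]."""
    total = 0.0
    for m in range(num_points):
        theta = -pi + (m + 0.5) * 2.0 * pi / num_points
        g = sum(bk * cmath.exp(-1j * k * theta) for k, bk in enumerate(b))
        total += log(abs(g))
    return total / num_points


gain_min_phase = mean_log_gain([1.0, 0.5])  # b_0 = 1, zero at z = -0.5
gain_scaled = mean_log_gain([2.0, 1.0])     # b_0 = 2, zero at z = -0.5
gain_non_min = mean_log_gain([1.0, 2.0])    # b_0 = 1, zero at z = -2
```

The minimum-phase filter with $b_0 = 1$ leaves the differential entropy rate unchanged, whereas a pure scaling by two or a zero outside the unit circle shifts it by $\ln 2$, although none of these filters loses information in the sense discussed above.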

VI. Other Examples 

While the case of linear filters is a particularly interesting 
one, the restriction to countable input and output alphabets 
suggests further examples illustrating the application of our 
theoretical results. 

A. Example 1: Finite-Precision Linear Filters 

The first example considers an extension to the subclass of 
discrete-time linear filters discussed in Section V. In many
practical applications in digital signal processing linear filters 
are implemented with finite-precision number representations 
only. We thus assume that both input process and filter 
coefficients take values from a finite set. For example, X may 
be a finite subset of the rational numbers Q, closed under 
modulo-addition. Multiplying two values from that set, e.g., by 
multiplying an input sample with a filter coefficient, typically 
yields a result not representable in X. As a consequence, 
after every multiplication a quantizer is necessary, essentially 
truncating the additional bits resulting from multiplication. Let 
the quantizer be described by a function $Q: \mathbb{R} \to \mathcal{X}$ with
$Q(a + X_n) = Q(a) \oplus X_n$ if $X_n \in \mathcal{X}$, where $\oplus$ denotes
modulo-addition (e.g., [10, pp. 373]). With this, (40) changes
to

$$Y_n = \bigoplus_{k=0}^{N} Q(b_k X_{n-k}) \oplus \bigoplus_{l=1}^{M} Q(a_l Y_{n-l}) \tag{42}$$

or

$$Y_n = Q\left( \bigoplus_{k=0}^{N} b_k X_{n-k} \oplus \bigoplus_{l=1}^{M} a_l Y_{n-l} \right) \tag{43}$$

depending on whether quantization is performed after multiplication
or after accumulation (in the latter case, the intermediate
results are represented in a larger set $\mathcal{X}'$). Note that due to
modulo-addition the result $Y_n$ remains in $\mathcal{X}$.

Fig. 2. Discrete-Time Hammerstein System. For $g(\cdot) = (\cdot)^2$ and if the linear
filter is a moving-average filter, this corresponds to a discretized model of the
energy detector.

We will now focus on filters with $b_0 = 1$. For filters with
infinite precision this can be done without loss of generality
by considering a constant gain factor $b_0$ and by normalizing
all $b_k$ coefficients. However, this gain normalization poses
a restriction in the finite-precision case since $b_k/b_0$ is not
necessarily an element of $\mathcal{X}$. With $b_0 = 1$, (42) and (43) change
to
to 



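$$Y_n = X_n \oplus \left( \bigoplus_{k=1}^{N} Q(b_k X_{n-k}) \oplus \bigoplus_{l=1}^{M} Q(a_l Y_{n-l}) \right) \tag{44}$$

and

$$Y_n = X_n \oplus Q\left( \bigoplus_{k=1}^{N} b_k X_{n-k} \oplus \bigoplus_{l=1}^{M} a_l Y_{n-l} \right) \tag{45}$$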

by the property of the quantizer. From this it can be seen
that either implementation is partially invertible (the terms
in parentheses in (44) and (45) are both in $\mathcal{X}$, and modulo-
addition has an inverse element). Consequently, even filters
with nonlinear elements can be shown to preserve information 
under certain circumstances despite the fact that the quantizer 
function is non-injective. 
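The partial invertibility of the quantize-after-accumulation variant with $b_0 = 1$ can be illustrated with a short simulation. This is a sketch under several assumptions that are not from the paper: the alphabet is the set of integers $\{0, \dots, 255\}$ with addition modulo 256, the quantizer truncates fractional bits (`floor`) before wrapping, the accumulation is carried out exactly with rational arithmetic in place of the larger set $\mathcal{X}'$, and the initial filter state is known to be zero. All names and coefficient values are hypothetical.

```python
from fractions import Fraction
from math import floor

MOD = 256  # hypothetical alphabet X = {0, ..., 255} with addition modulo 256


def quantize(a):
    """Truncate fractional bits and wrap into X; satisfies Q(a + x) = Q(a) (+) x for x in X."""
    return floor(a) % MOD


def accumulate(past_x, past_y, b, a):
    """Exact accumulation of the feed-forward and feedback terms (k, l >= 1)."""
    return (sum(bk * xk for bk, xk in zip(b, past_x))
            + sum(al * yl for al, yl in zip(a, past_y)))


def run_filter(xs, b, a):
    """Forward filter: Y_n = X_n (+) Q(accumulated past terms), zero initial state."""
    past_x, past_y, ys = [0] * len(b), [0] * len(a), []
    for x in xs:
        y = (x + quantize(accumulate(past_x, past_y, b, a))) % MOD
        ys.append(y)
        past_x = ([x] + past_x)[:len(b)]
        past_y = ([y] + past_y)[:len(a)]
    return ys


def run_inverse(ys, b, a):
    """Partial inverse: X_n = Y_n (-) Q(accumulated past terms), zero initial state."""
    past_x, past_y, xs = [0] * len(b), [0] * len(a), []
    for y in ys:
        x = (y - quantize(accumulate(past_x, past_y, b, a))) % MOD
        xs.append(x)
        past_x = ([x] + past_x)[:len(b)]
        past_y = ([y] + past_y)[:len(a)]
    return xs


b = [Fraction(3, 4), Fraction(-1, 8)]  # b_1, b_2 (b_0 = 1 is implicit)
a = [Fraction(1, 2)]                   # a_1
xs = [200, 17, 255, 3, 128, 64, 0, 99]
ys = run_filter(xs, b, a)
recovered = run_inverse(ys, b, a)
```

Although `quantize` discards information about its argument, that argument depends only on past samples, so the current input is recovered exactly once the past is known.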

B. Example 2: Multiplying Consecutive Inputs 

Another nonlinear system satisfying Definition 3 is given
by the following input-output relationship:

$$Y_n = X_n X_{n-1}. \tag{46}$$

The partial inverse in this case would be $X_n = Y_n / X_{n-1}$ if
$X_{n-1} \ne 0$, while for $X_{n-1} = 0$ no such inverse exists.
Therefore, this example represents a class of systems whose
partial invertibility depends on the alphabet $\mathcal{X}$ of the stochastic
process. If the process $\mathbf{X}$ is such that $\mathcal{X}$ does not contain the
element 0, the partial inverse exists and we obtain for $X_n$,
$n > 1$:

$$X_n = \begin{cases} X_1 \prod_{k=1}^{(n-1)/2} \dfrac{Y_{2k+1}}{Y_{2k}}, & \text{for odd } n \\[2ex] \dfrac{Y_n}{X_1} \prod_{k=1}^{(n-2)/2} \dfrac{Y_{2k}}{Y_{2k+1}}, & \text{for even } n \end{cases} \tag{47}$$



Indeed, since all $X_n$, $n > 1$, can be computed from $X_1$
and $Y_1^n$ we obtain $H(X_1^n | Y_1^n) = H(X_1 | Y_1^n)$ which is in
perfect accordance with Theorem 4. Reconstruction of $\mathbf{X}$ is
thus possible up to an unknown $X_1$. Note, however, that this
unknown sample influences the whole reconstructed sequence
as shown in (47). Thus, even though the information loss rate
vanishes, perfect reconstruction of any subsequence of X is 
impossible by observing the output process Y only. 
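The effect of the unknown first sample can be made concrete with a short sketch. The input sequence, the helper names, and the use of exact rational arithmetic are assumptions chosen for illustration; the recursion itself is the partial inverse $X_n = Y_n / X_{n-1}$ from above.

```python
from fractions import Fraction


def product_system(xs):
    """Outputs Y_n = X_n * X_{n-1} for n = 2, ..., len(xs)."""
    return [xs[n] * xs[n - 1] for n in range(1, len(xs))]


def reconstruct(x1, ys):
    """Apply the partial inverse X_n = Y_n / X_{n-1}, starting from a value for X_1."""
    xs = [Fraction(x1)]
    for y in ys:
        xs.append(Fraction(y) / xs[-1])
    return xs


true_xs = [2, 3, 1, 5, 2]       # hypothetical input samples, none equal to zero
ys = product_system(true_xs)
with_correct_start = reconstruct(2, ys)
with_wrong_start = reconstruct(1, ys)
```

With the correct value of $X_1$ the whole sequence is recovered, while a wrong guess yields a sequence that reproduces the same outputs but differs from the true input in every sample.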

C. Example 3: Hammerstein Systems 

A final example considers a simple special case of a 
nonlinear dynamical system, namely, a cascade of a static 
nonlinearity and a linear filter [13]. Such a cascade, usually
referred to as Hammerstein system, is depicted in Fig. 2. A



practical example of such a Hammerstein system is the energy 
detector, a popular low-complexity receiver architecture in 
wireless communications. In the discrete-time case the input- 
output relationship is given by 

$$Y_n = \sum_{k=0}^{N} b_k g(X_{n-k}) + \sum_{l=1}^{M} a_l Y_{n-l}. \tag{48}$$

As is easily seen, this system is partially invertible if and
only if the function $g$ has an inverse. If $g$ is not invertible, we
obtain in the light of Theorem 2

$$Y_n = f_{\Theta_n}(X_n) = b_0 g(X_n) + C_{\Theta_n} \tag{49}$$

where $C_{\Theta_n}$ is a constant depending on the random parameter
$\Theta_n$. With this and $f_\theta^{-1}[f_\theta(x)] = g^{-1}[g(x)]$ for all $x \in \mathcal{X}$,
$\theta \in \mathcal{T}$ we obtain an upper bound on the information loss rate:

$$\bar{H}(\mathbf{X}|\mathbf{Y}) \le \max_{x \in \mathcal{X}} \log \left| g^{-1}[g(x)] \right|. \tag{50}$$

Interestingly, the structure of this system allows a simplified 
analysis: Since the information loss rate of a cascade of 
systems is equal to the sum of individual information loss 
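rates (cf. Theorem 3) we can analyze both constituent systems
separately. The linear filter was already shown to preserve
full information, so any information loss will be caused by
the static nonlinearity, i.e., $\bar{H}(\mathbf{X}|\mathbf{Y}) = \bar{H}(\mathbf{X}|\mathbf{V})$, with $\mathbf{V}$ denoting the process between the static nonlinearity and the linear filter. This is in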
accordance with the observation that the Hammerstein system 
is partially invertible if the static nonlinearity is invertible. 

For static nonlinearities the analytic treatment of infor- 
mation loss is simple compared to dynamical systems. In 
particular, for an independent, identically distributed (iid) input 
process X the information loss rate can be shown to be equal 
to the zeroth-order conditional entropy, $H(X|V)$, while for
a general stationary process this quantity acts as an upper
bound [2]. The upper bound from Theorem 2 turns out to be
even more general, since it also provides an upper bound on
$H(X|V)$ in the case of an iid input process (cf. Theorem 4
in [6]). An in-depth analysis of the interplay between these
bounds is the object of future work. 
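The relation between the two quantities can be illustrated for the squaring nonlinearity of the energy detector. The sketch below assumes an iid input that is uniform on the hypothetical alphabet $\{-2, \dots, 2\}$; the helper names are illustrative only, and all values are in bits.

```python
from collections import defaultdict
from math import log2


def conditional_entropy_given_function(pmf_x, g):
    """Zeroth-order H(X | g(X)) in bits for a pmf given as {x: probability}."""
    groups = defaultdict(list)
    for x, p in pmf_x.items():
        groups[g(x)].append(p)
    h = 0.0
    for probs in groups.values():
        p_v = sum(probs)  # probability of this output value
        h -= sum(p * log2(p / p_v) for p in probs if p > 0)
    return h


def preimage_bound(alphabet, g):
    """max over x of log2 |g^{-1}[g(x)]|, the bound for a static nonlinearity."""
    sizes = defaultdict(int)
    for x in alphabet:
        sizes[g(x)] += 1
    return log2(max(sizes.values()))


def square(x):
    return x * x


alphabet = (-2, -1, 0, 1, 2)
uniform_pmf = {x: 1.0 / len(alphabet) for x in alphabet}
loss_iid = conditional_entropy_given_function(uniform_pmf, square)
bound = preimage_bound(alphabet, square)
```

Here the conditional entropy is 0.8 bit, because the sample $x = 0$ is recovered without ambiguity, while the preimage bound is a full bit; the bound is therefore valid but not tight in this case.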

VII. Conclusion 

In this work we have presented general results on the 
information loss of dynamical systems for stationary stochastic 
input and output processes on countable alphabets. Further- 
more, we have extended the proof of the data processing 



inequality stating that the entropy rate at the output of the 
system cannot be larger than the entropy rate at the input and 
have derived an upper bound on the information loss rate. The 
additivity of information loss rates for cascaded systems could 
be shown, too. 

We have further identified a family of systems for which 
this upper bound is zero, i.e., for which the information loss 
rate vanishes. Not only linear filters belong to that family, but 
also their nonlinear counterparts common in finite-precision 
signal processing. 

Future research will extend these results to the case of 
continuous-valued stochastic processes and the application to 
common nonlinear systems, e.g., Volterra models. 